The Statistic as a Mapping
A statistic is formally defined as a function $h: \mathbb{R}^n \to \mathbb{R}$ of the sample. The probability that the statistic falls in a set $B$ is defined through the pre-image:
$$h^{-1}(B) = \{(x_1, x_2, \dots, x_n) : h(x_1, x_2, \dots, x_n) \in B\}$$
so that $P(h(X_1, \dots, X_n) \in B) = P((X_1, \dots, X_n) \in h^{-1}(B))$.
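When the sample space is discrete, the pre-image can be enumerated directly. A minimal sketch in Python, assuming (for illustration only) that $h$ is the sample mean of $n = 2$ draws from the support $\{1, 2, 3\}$ and $B = \{2.0\}$:

```python
from itertools import product

# Hypothetical statistic for illustration: the sample mean of n = 2 draws.
def h(x):
    return sum(x) / len(x)

# Enumerate the pre-image h^{-1}(B) for B = {2.0} over the discrete support.
support = [1, 2, 3]
preimage = [x for x in product(support, repeat=2) if h(x) in {2.0}]
# preimage holds every pair whose mean is exactly 2: (1, 3), (2, 2), (3, 1)
```

The event "the statistic equals 2" is thus identified with a concrete set of sample points, whose probabilities can then be summed.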
The I.I.D. Foundation
For a sample of i.i.d. (independent and identically distributed) random variables, the joint probability of a specific sample point $(x_1, \dots, x_n)$ is the product of the marginal probabilities: $p(x_1)p(x_2)\cdots p(x_n)$. This product is the weight each point contributes when computing the total probability that the statistic takes a given value.
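The product rule is one line of code. A sketch in Python, using a hypothetical two-valued pmf chosen purely for illustration:

```python
from math import prod

# Hypothetical marginal pmf (values chosen arbitrarily for illustration)
p = {0: 0.3, 1: 0.7}

def joint_prob(sample):
    """Weight of an i.i.d. sample point: the product of the marginal probabilities."""
    return prod(p[x] for x in sample)

joint_prob((1, 0, 1))  # 0.7 * 0.3 * 0.7
```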
Consider a discrete population where $p_X(1) = 1/2$, $p_X(2) = 1/4$, and $p_X(3) = 1/4$. We draw a sample of size $n=2$ ($X_1, X_2$) and define our statistic as the geometric mean: $Y_2 = (X_1 X_2)^{1/2}$.
To find the distribution of $Y_2$, we list all 9 possible pairs $(X_1, X_2)$, calculate each pair's joint probability, and record the resulting value of $Y_2$:
| Pair(s) $(x_1, x_2)$ | Prob. $p_X(x_1)p_X(x_2)$ | $Y_2 = \sqrt{x_1 x_2}$ |
|---|---|---|
| (1, 1) | 1/4 | 1.000 |
| (1, 2), (2, 1) | 1/8 + 1/8 = 1/4 | 1.414 |
| (1, 3), (3, 1) | 1/8 + 1/8 = 1/4 | 1.732 |
| (2, 2) | 1/16 | 2.000 |
| (2, 3), (3, 2) | 1/16 + 1/16 = 1/8 | 2.449 |
| (3, 3) | 1/16 | 3.000 |
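The table can be reproduced by brute-force enumeration. A sketch in Python, weighting each of the 9 pairs by its joint probability and accumulating the pmf of $Y_2$:

```python
from itertools import product
from math import sqrt
from collections import defaultdict

# Population pmf from the example
p = {1: 0.5, 2: 0.25, 3: 0.25}

# Enumerate all 9 pairs, weight each by its joint probability,
# and accumulate the pmf of Y = sqrt(x1 * x2), rounding Y to 3 decimals.
pmf_Y = defaultdict(float)
for x1, x2 in product(p, repeat=2):
    pmf_Y[round(sqrt(x1 * x2), 3)] += p[x1] * p[x2]

# pmf_Y: {1.0: 0.25, 1.414: 0.25, 1.732: 0.25, 2.0: 0.0625, 2.449: 0.125, 3.0: 0.0625}
```

The probabilities match the table and sum to 1, as they must.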
Exact vs. Asymptotic Distributions
Before moving to limit theorems such as the Central Limit Theorem (CLT), we must master the "Exact Distribution": the specific probability mass or density function of a statistic for a small, finite $n$, computed by direct enumeration as above. When the analytic form becomes intractable, we resort to numerical simulation via **Monte Carlo approximation**.
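A minimal Monte Carlo sketch in Python, reusing the geometric-mean example: draw many samples of size $n = 2$ from the population (here via `random.choices` for weighted sampling) and tabulate the relative frequencies of $Y_2$, which approximate the exact pmf.

```python
import random
from math import sqrt
from collections import Counter

random.seed(0)  # fixed seed so the run is reproducible

# Population from the worked example
values, weights = [1, 2, 3], [0.5, 0.25, 0.25]

# Draw N samples of size n = 2 and tabulate the statistic Y = sqrt(x1 * x2).
N = 100_000
counts = Counter()
for _ in range(N):
    x1, x2 = random.choices(values, weights, k=2)
    counts[round(sqrt(x1 * x2), 3)] += 1

# Relative frequencies approximate the exact pmf,
# e.g. counts[1.0] / N should be close to the exact value 0.25.
```

With $N = 100{,}000$ draws the sampling error of each frequency is on the order of $\sqrt{p(1-p)/N} \approx 0.001$, so the simulated pmf agrees with the exact table to about two decimal places.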